Problem Note 54476: DBCS characters from SAS® Content Categorization code are incorrectly processed when used by the Text Rule Builder node

DBCS (double-byte character set) characters from SAS Content Categorization code are processed incorrectly when the code is used by the Text Rule Builder node in SAS® Text Miner. The node reads the characters as UTF-8. However, the underlying tgcode.txt file is built and saved using the session encoding, which might differ from UTF-8. This issue occurs for languages that contain non-Latin1 characters.
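
The mismatch described above can be illustrated outside of SAS. The following Python sketch is only an illustration, not the node's actual code: the sample string and the choice of Shift-JIS as the session encoding are assumptions made for the example.

```python
# Illustration only: text written in one encoding and read back as UTF-8.
text = "分類"  # hypothetical sample text, not taken from a real tgcode.txt

# Assumption: the session encoding is Shift-JIS, so the saved bytes follow it.
session_bytes = text.encode("shift_jis")

# A strict UTF-8 read of those bytes fails.
try:
    session_bytes.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# A lenient UTF-8 read succeeds but substitutes replacement characters.
lossy = session_bytes.decode("utf-8", errors="replace")

# Reading with the same encoding that was used for writing recovers the text.
round_trip = session_bytes.decode("shift_jis")

# Plain ASCII text has the same bytes in both encodings, so it is unaffected.
ascii_same = "category".encode("shift_jis") == "category".encode("utf-8")
```

In this sketch the strict read fails and the lenient read loses the original characters, which matches the kind of incorrect processing that the note describes for multi-byte text.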

If the data set and the corresponding code are saved in a session that uses UTF-8 encoding, then the Text Rule Builder node processes the code correctly.
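
One way to see whether a given file was saved as UTF-8 is to test whether its bytes decode cleanly. The helper below is a minimal sketch under that assumption; `is_valid_utf8` and the sample file names are hypothetical and are not part of any SAS product.

```python
import tempfile
from pathlib import Path


def is_valid_utf8(path):
    """Return True if the file's bytes decode cleanly as UTF-8."""
    data = Path(path).read_bytes()
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Demonstration with two throwaway files that hold the same sample text.
with tempfile.TemporaryDirectory() as tmp:
    utf8_file = Path(tmp) / "utf8_sample.txt"
    sjis_file = Path(tmp) / "sjis_sample.txt"
    utf8_file.write_bytes("分類".encode("utf-8"))
    sjis_file.write_bytes("分類".encode("shift_jis"))  # assumed session encoding
    utf8_result = is_valid_utf8(utf8_file)
    sjis_result = is_valid_utf8(sjis_file)
```

A call such as `is_valid_utf8("tgcode.txt")` would apply the same check to a real file. A file that contains only ASCII characters also passes, so a passing result does not by itself show that the file contains DBCS characters.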



Operating System and Release Information

Product Family  Product         System                                 Product Release   SAS Release
                                                                       Reported  Fixed*  Reported   Fixed*
SAS System      SAS Text Miner  Microsoft® Windows® for x64            5.1_M1    12.3    9.3 TS1M1  9.4 TS1M0
                                Microsoft Windows Server 2008 R2       5.1_M1    12.3    9.3 TS1M1  9.4 TS1M0
                                Microsoft Windows Server 2008 for x64  5.1_M1    12.3    9.3 TS1M1  9.4 TS1M0
                                Windows 7 Enterprise x64               5.1_M1    12.3    9.3 TS1M1  9.4 TS1M0
                                Windows 7 Professional x64             5.1_M1    12.3    9.3 TS1M1  9.4 TS1M0
                                64-bit Enabled AIX                     5.1_M1    12.3    9.3 TS1M1  9.4 TS1M0
                                64-bit Enabled Solaris                 5.1_M1    12.3    9.3 TS1M1  9.4 TS1M0
                                HP-UX IPF                              5.1_M1    12.3    9.3 TS1M1  9.4 TS1M0
                                Linux for x64                          5.1_M1    12.3    9.3 TS1M1  9.4 TS1M0
                                Solaris for x64                        5.1_M1    12.3    9.3 TS1M1  9.4 TS1M0
* For software releases that are not yet generally available, the Fixed Release is the software release in which the problem is planned to be fixed.